AITopics | Osijek-Baranja County

Collaborating Authors

Osijek-Baranja County

Improving Transfer Learning for Sequence Labeling Tasks by Adapting Pre-trained Neural Language Models

arXiv.org Artificial IntelligenceOct-24-2025

This doctoral thesis improves the transfer learning for sequence labeling tasks by adapting pre-trained neural language models. The proposed improvements in transfer learning involve introducing a multi-task model that incorporates an additional signal, a method based on architectural modifications in autoregressive large language models, and a sequence labeling framework for autoregressive large language models utilizing supervised in-context fine-tuning combined with response-oriented adaptation strategies. The first improvement is given in the context of domain transfer for the event trigger detection task. The domain transfer of the event trigger detection task can be improved by incorporating an additional signal obtained from a domain-independent text processing system into a multi-task model. The second improvement involves modifying the model's architecture. For that purpose, a method is proposed to enable bidirectional information flow across layers of autoregressive large language models. The third improvement utilizes autoregressive large language models as text generators through a generative supervised in-context fine-tuning framework. The proposed model, method, and framework demonstrate that pre-trained neural language models achieve their best performance on sequence labeling tasks when adapted through targeted transfer learning paradigms.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2510.20033

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(43 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Government (1.00)
Banking & Finance > Economy (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Interaction-Aware Model Predictive Decision-Making for Socially-Compliant Autonomous Driving in Mixed Urban Traffic Scenarios

Varga, Balint, Brand, Thomas, Schmitz, Marcus, Hashemi, Ehsan

arXiv.org Artificial IntelligenceFeb-21-2025

This paper presents the experimental validation of an interaction-aware model predictive decision-making (IAMPDM) approach in the course of a simulator study. The proposed IAMPDM uses a model of the pedestrian, which simultaneously predicts their future trajectories and characterizes the interaction between the pedestrian and the automated vehicle. The main benefit of the proposed concept and the experiment is that the interaction between the pedestrian and the socially compliant autonomous vehicle leads to smoother traffic. Furthermore, the experiment features a novel human-in-the-decision-loop aspect, meaning that the test subjects have no expected behavior or defined sequence of their actions, better imitating real traffic scenarios. Results show that intention-aware decision-making algorithms are more effective in realistic conditions and contribute to smoother traffic flow than state-of-the-art solutions. Furthermore, the findings emphasize the crucial impact of intention-aware decision-making on autonomous vehicle performance in urban areas and the need for further research.

algorithm, scenario, vehicle, (14 more...)

arXiv.org Artificial Intelligence

2503.01852

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
(12 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

Add feedback

Decision-Focused Learning for Complex System Identification: HVAC Management System Application

Favaro, Pietro, Toubeau, Jean-François, Vallée, François, Dvorkin, Yury

arXiv.org Artificial IntelligenceJan-24-2025

As opposed to conventional training methods tailored to minimize a given statistical metric or task-agnostic loss (e.g., mean squared error), Decision-Focused Learning (DFL) trains machine learning models for optimal performance in downstream decision-making tools. We argue that DFL can be leveraged to learn the parameters of system dynamics, expressed as constraint of the convex optimization control policy, while the system control signal is being optimized, thus creating an end-to-end learning framework. This is particularly relevant for systems in which behavior changes once the control policy is applied, hence rendering historical data less applicable. The proposed approach can perform system identification - i.e., determine appropriate parameters for the system analytical model - and control simultaneously to ensure that the model's accuracy is focused on areas most relevant to control. Furthermore, because black-box systems are non-differentiable, we design a loss function that requires solely to measure the system response. We propose pre-training on historical data and constraint relaxation to stabilize the DFL and deal with potential infeasibilities in learning. We demonstrate the usefulness of the method on a building Heating, Ventilation, and Air Conditioning day-ahead management system for a realistic 15-zone building located in Denver, US. The results show that the conventional RC building model, with the parameters obtained from historical data using supervised learning, underestimates HVAC electrical power consumption. For our case study, the ex-post cost is on average six times higher than the expected one. Meanwhile, the same RC model with parameters obtained via DFL underestimates the ex-post cost only by 3%.

artificial intelligence, machine learning, optimization problem, (13 more...)

arXiv.org Artificial Intelligence

2501.14708

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Colorado > Denver County > Denver (0.04)
North America > United States > New York > New York County > New York City (0.04)
(10 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Energy > Power Industry (1.00)
Construction & Engineering > HVAC (0.75)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

An Improved Approach for Cardiac MRI Segmentation based on 3D UNet Combined with Papillary Muscle Exclusion

Benameur, Narjes, Mahmoudi, Ramzi, Deriche, Mohamed, fayouka, Amira, Masmoudi, Imene, Zoghlami, Nessrine

arXiv.org Artificial IntelligenceOct-9-2024

Left ventricular ejection fraction (LVEF) is the most important clinical parameter of cardiovascular function. The accuracy in estimating this parameter is highly dependent upon the precise segmentation of the left ventricle (LV) structure at the end diastole and systole phases. Therefore, it is crucial to develop robust algorithms for the precise segmentation of the heart structure during different phases. Methodology: In this work, an improved 3D UNet model is introduced to segment the myocardium and LV, while excluding papillary muscles, as per the recommendation of the Society for Cardiovascular Magnetic Resonance. For the practical testing of the proposed framework, a total of 8,400 cardiac MRI images were collected and analysed from the military hospital in Tunis (HMPIT), as well as the popular ACDC public dataset. As performance metrics, we used the Dice coefficient and the F1 score for validation/testing of the LV and the myocardium segmentation. Results: The data was split into 70%, 10%, and 20% for training, validation, and testing, respectively. It is worth noting that the proposed segmentation model was tested across three axis views: basal, medio basal and apical at two different cardiac phases: end diastole and end systole instances. The experimental results showed a Dice index of 0.965 and 0.945, and an F1 score of 0.801 and 0.799, at the end diastolic and systolic phases, respectively. Additionally, clinical evaluation outcomes revealed a significant difference in the LVEF and other clinical parameters when the papillary muscles were included or excluded.

dice metric, papillary muscle, segmentation, (13 more...)

arXiv.org Artificial Intelligence

2410.06818

Country:

Africa > Middle East > Tunisia > Tunis Governorate > Tunis (0.25)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > Middle East > UAE > Ajman Emirate > Ajman (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)

Add feedback

The use of GPT-4o and Other Large Language Models for the Improvement and Design of Self-Assessment Scales for Measurement of Interpersonal Communication Skills

Bubaš, Goran

arXiv.org Artificial IntelligenceSep-21-2024

OpenAI's ChatGPT (GPT-4 and GPT-4o) and other Large Language Models (LLMs) like Microsoft's Copilot, Google's Gemini 1.5 Pro, and Antrophic's Claude 3.5 Sonnet can be effectively used in various phases of scientific research. Their performance in diverse verbal tasks and reasoning is close to or above the average human level and rapidly increasing, providing those models with a capacity that resembles a relatively high level of theory of mind. The current ability of LLMs to process information about human psychology and communication creates an opportunity for their scientific use in the fields of personality psychology and interpersonal communication skills. This article illustrates the possible uses of GPT-4o and other advanced LLMs for typical tasks in designing self-assessment scales for interpersonal communication skills measurement like the selection and improvement of scale items and evaluation of content validity of scales. The potential for automated item generation and application is illustrated as well. The case study examples are accompanied by prompts for LLMs that can be useful for these purposes. Finally, a summary is provided of the potential benefits of using LLMs in the process of evaluation, design, and improvement of interpersonal communication skills self-assessment scales.

communication skill, gpt-4o, llm, (15 more...)

arXiv.org Artificial Intelligence

2409.1405

Country:

Europe > Austria > Vienna (0.14)
Europe > Croatia > Zagreb County > Zagreb (0.04)
Asia > Singapore (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Education > Assessment & Standards (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Add feedback

Compressing Sentence Representation with maximum Coding Rate Reduction

Ševerdija, Domagoj, Prusina, Tomislav, Jovanović, Antonio, Borozan, Luka, Maltar, Jurica, Matijević, Domagoj

arXiv.org Artificial IntelligenceApr-25-2023

In most natural language inference problems, sentence representation is needed for semantic retrieval tasks. In recent years, pre-trained large language models have been quite effective for computing such representations. These models produce high-dimensional sentence embeddings. An evident performance gap between large and small models exists in practice. Hence, due to space and time hardware limitations, there is a need to attain comparable results when using the smaller model, which is usually a distilled version of the large language model. In this paper, we assess the model distillation of the sentence representation model Sentence-BERT by augmenting the pre-trained distilled model with a projection layer additionally learned on the Maximum Coding Rate Reduction (MCR2)objective, a novel approach developed for general-purpose manifold clustering. We demonstrate that the new language model with reduced complexity and sentence embedding size can achieve comparable results on semantic retrieval benchmarks.

artificial intelligence, natural language, text processing, (16 more...)

arXiv.org Artificial Intelligence

2304.12674

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

sEMG-Based Upper Limb Movement Classifier: Current Scenario and Upcoming Challenges

Cagliari Tosin, Maurício (a:1:{s:5:"en_US";s:41:"Universidade Federal do Rio Grande do Sul";}) | Machado, Juliano Costa | Balbinot, Alexandre

Journal of Artificial Intelligence ResearchSep-19-2022

Despite achieving accuracies higher than 90% on recognizing upper-limb movements through sEMG (surface Electromyography) signal with the state of art classifiers in the laboratory environment, there are still issues to be addressed for a myo-controlled prosthesis achieve similar performance in real environment conditions. Thereby, the main goal of this review is to expose the latest researches in terms of strategies in each block of the system, giving a global view of the current state of academic research. A systematic review was conducted, and the retrieved papers were organized according to the system step related to the proposed method. Then, for each stage of the upper limb motion recognition system, the works were described and compared in terms of strategy, methodology and issue addressed. An additional section was destined for the description of works related to signal contamination that is often neglected in reviews focused on sEMG based motion classifiers. Therefore, this section is the main contribution of this paper. Deep learning methods are a current trend for classification stage, providing strategies based on time-series and transfer learning to address the issues related to limb position, temporal/inter-subject variation, and electrode displacement. Despite the promising strategies presented for contaminant detection, identification, and removal, there are still some factors to be considered, such as the occurrence of simultaneous contaminants.

algorithm, classifier, international conference, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.13999

AI Access Foundation

13999

Journal of Artificial Intelligence Research

Country:

Europe > Switzerland (0.04)
Europe > Italy > Sardinia > Cagliari (0.04)
Europe > Germany > Berlin (0.04)
(31 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Energy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Bridging the gender digital divide: AI Hackathon with Microsoft supports girls' digital skills

#artificialintelligenceJun-5-2021, 19:55:11 GMT

Or so Hesme, aged 15, believed when she switched schools in 10th grade. "I thought I'd be terrible at it", she says. When she moved to Curro Heritage House High School, STEM classes were a regular part of the curriculum. She was nervous about that – but when her brother dared her to take a computer science class, she accepted the challenge to prove him wrong. Hesme loved her computer science class.

ai hackathon, digital skill, gender digital, (10 more...)

#artificialintelligence

Country:

Africa > South Africa (0.06)
Europe > Middle East (0.05)
Europe > Croatia > Osijek-Baranja County > Osijek (0.05)
(2 more...)

Genre: Contests & Prizes (0.48)

Industry:

Education > Curriculum > Subject-Specific Education (0.58)
Commercial Services & Supplies > Security & Alarm Services (0.56)
Health & Medicine > Therapeutic Area (0.39)
Education > Educational Setting (0.38)

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

Explaining Predictions of Deep Neural Classifier via Activation Analysis

Stano, Martin, Benesova, Wanda, Martak, Lukas Samuel

arXiv.org Artificial IntelligenceDec-3-2020

In many practical applications, deep neural networks have been typically deployed to operate as a black box predictor. Despite the high amount of work on interpretability and high demand on the reliability of these systems, they typically still have to include a human actor in the loop, to validate the decisions and handle unpredictable failures and unexpected corner cases. This is true in particular for failure-critical application domains, such as medical diagnosis. We present a novel approach to explain and support an interpretation of the decision-making process to a human expert operating a deep learning system based on Convolutional Neural Network (CNN). By modeling activation statistics on selected layers of a trained CNN via Gaussian Mixture Models (GMM), we develop a novel perceptual code in binary vector space that describes how the input sample is processed by the CNN. By measuring distances between pairs of samples in this perceptual encoding space, for any new input sample, we can now retrieve a set of most perceptually similar and dissimilar samples from an existing atlas of labeled samples, to support and clarify the decision made by the CNN model. Possible uses of this approach include for example Computer-Aided Diagnosis (CAD) systems working with medical imaging data, such as Magnetic Resonance Imaging (MRI) or Computed Tomography (CT) scans. We demonstrate the viability of our method in the domain of medical imaging for patient condition diagnosis, as the proposed decision explanation method via similar ground truth domain examples (e.g. from existing diagnosis archives) will be interpretable by the operating medical personnel. Our results indicate that our method is capable of detecting distinct prediction strategies that enable us to identify the most similar predictions from an existing atlas.

dataset, prediction basis, representation, (15 more...)

arXiv.org Artificial Intelligence

2012.02248

Country:

Europe > Slovakia > Bratislava > Bratislava (0.04)
Europe > Austria > Upper Austria > Linz (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Croatia > Osijek-Baranja County > Osijek (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Explainable Artificial Intelligence: a Systematic Review

Vilone, Giulia, Longo, Luca

arXiv.org Artificial IntelligenceJun-4-2020

This has led to the development of a plethora of domain-dependent and context-specific methods for dealing with the interpretation of machine learning (ML) models and the formation of explanations for humans. Unfortunately, this trend is far from being over, with an abundance of knowledge in the field which is scattered and needs organisation. The goal of this article is to systematically review research works in the field of XAI and to try to define some boundaries in the field. From several hundreds of research articles focused on the concept of explainability, about 350 have been considered for review by using the following search methodology. In a first phase, Google Scholar was queried to find papers related to "explainable artificial intelligence", "explainable machine learning" and "interpretable machine learning". Subsequently, the bibliographic section of these articles was thoroughly examined to retrieve further relevant scientific studies. The first noticeable thing, as shown in figure 2 (a), is the distribution of the publication dates of selected research articles: sporadic in the 70s and 80s, receiving preliminary attention in the 90s, showing raising interest in 2000 and becoming a recognised body of knowledge after 2010. The first research concerned the development of an explanation-based system and its integration in a computer program designed to help doctors make diagnoses [3]. Some of the more recent papers focus on work devoted to the clustering of methods for explainability, motivating the need for organising the XAI literature [4, 5, 6].

machine learning, natural language, neural information processing system, (17 more...)

arXiv.org Artificial Intelligence

2006.00093

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > New York > New York County > New York City (0.14)
(90 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
(5 more...)

Add feedback